Mining the Personal Interests of Microbloggers via Exploiting Wikipedia Knowledge

Authors

  • Miao Fan
  • Qiang Zhou
  • Thomas Fang Zheng
Abstract

This paper focuses on an emerging research topic: mining microbloggers’ personalized interest tags from the microblogs they have posted. It is based on the intuition that microblogs indicate the daily interests and concerns of microbloggers. Previous studies treated all the microblogs posted by one microblogger as a single document and adopted traditional keyword extraction approaches to select highly weighted nouns, without considering the characteristics of microblogs. Given the limited textual information in microblogs and the implicit way microbloggers express their interests, we propose a new research framework for mining microbloggers’ interests that exploits Wikipedia, a huge online encyclopedia of world knowledge, to address those challenges. Based on a semantic graph constructed from Wikipedia, the proposed semantic spreading model (SSM) can discover and leverage semantically related interest tags that do not occur in one’s microblogs. Based on SSM, an interest mining system has been implemented and deployed on Sina Weibo, the biggest microblogging platform in China. We have also specified a suite of new evaluation metrics to compensate for the lack of evaluation functions in this research topic. Experiments conducted on a real-time dataset demonstrate that our approach outperforms state-of-the-art methods in identifying microbloggers’ interests.
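
The idea of surfacing related tags that never appear in a user's posts can be illustrated with a generic spreading-activation pass over a small weighted graph. This is only a sketch under stated assumptions, not the paper's SSM: the function name `spread_activation`, the toy graph, the edge weights, and the `decay` and `steps` parameters are all hypothetical and chosen for illustration.

```python
from collections import defaultdict


def spread_activation(graph, seeds, decay=0.5, steps=2):
    """Generic spreading activation (illustrative; not the paper's SSM).

    graph: dict mapping node -> {neighbor: edge weight}
    seeds: dict mapping node -> initial score (e.g. terms found in posts)
    """
    scores = defaultdict(float, seeds)
    frontier = dict(seeds)
    for _ in range(steps):
        next_frontier = defaultdict(float)
        # Each active node passes a decayed share of its energy to neighbors.
        for node, energy in frontier.items():
            for neighbor, weight in graph.get(node, {}).items():
                next_frontier[neighbor] += energy * weight * decay
        # Accumulate the energy received in this round.
        for node, energy in next_frontier.items():
            scores[node] += energy
        frontier = next_frontier
    return dict(scores)


# Hypothetical toy graph; the nodes and weights are made up for the example.
graph = {
    "iPhone": {"Apple Inc.": 1.0, "Smartphone": 0.5},
    "Apple Inc.": {"Smartphone": 0.5},
}
seeds = {"iPhone": 1.0}  # the only term assumed to occur in the user's posts
scores = spread_activation(graph, seeds)
```

In this toy run, "Apple Inc." and "Smartphone" receive non-zero scores even though only "iPhone" was seeded, which is the kind of behavior the abstract attributes to leveraging a semantic graph.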

Similar resources

Exploring Wikipedia and DMoz as Knowledge Bases for Engineering a User Interests Hierarchy for Social Network Applications

The outgrowth of social networks in the recent years has resulted in opportunities for interesting data mining problems, such as interest or friendship recommendations. A global ontology over the interests specified by the users of a social network is essential for accurate recommendations. We propose, evaluate and compare three approaches to engineering a hierarchical ontology over user intere...

Advertising Keyword Suggestion Using Relevance-Based Language Models from Wikipedia Rich Articles

When emerging technologies such as Search Engine Marketing (SEM) face tasks that require human level intelligence, it is inevitable to use the knowledge repositories to endow the machine with the breadth of knowledge available to humans. Keyword suggestion for search engine advertising is an important problem for sponsored search and SEM that requires a goldmine repository of knowledge. A recen...

Wikipedia Mining: Wikipedia as a Corpus for Knowledge Extraction

Wikipedia, a collaborative Wiki-based encyclopedia, has become a huge phenomenon among Internet users. It covers a huge number of concepts of various fields such as Arts, Geography, History, Science, Sports and Games. As a corpus for knowledge extraction, Wikipedia’s impressive characteristics are not limited to the scale, but also include the dense link structure, word sense disambiguation bas...

WikiNet: A Very Large Scale Multi-Lingual Concept Network

This paper describes a multi-lingual concept network obtained automatically by mining for concepts and relations and exploiting a variety of sources of knowledge from Wikipedia. Concepts and their lexicalizations are extracted from Wikipedia pages. Relations are extracted from the category and page network, infoboxes and the body of the articles. The network consists of a central, language inde...

Wikipedia Link Structure and Text Mining for Semantic Relation Extraction

Wikipedia, a collaborative Wiki-based encyclopedia, has become a huge phenomenon among Internet users. It covers a huge number of concepts of various fields such as Arts, Geography, History, Science, Sports and Games. Since it is becoming a database storing all human knowledge, Wikipedia mining is a promising approach that bridges the Semantic Web and the Social Web (a.k.a. Web 2.0). In fact, i...

Publication date: 2014